{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h1 align='center'>Environment Canada Weather Data Notebook Demo Part 2</h1>\n",
    "\n",
    "In this note book, we will create an animated heat map to show the changes in monthly temperature throughout Ontario from February 2017 to January 2019. The data is captured identically as it was in part 1 of this series, however, in the interest of time we have aggregated it by month and prepared this data set in advance. This notebook will show you how to use open data to create interesting time-dependent animations of temperature patterns, as well as demonstrate how different ways of filling in missing data between weather stations can _drastically_ affect the presentation of data and potential interpretations of that data. \n",
    "\n",
    "\n",
    "\n",
    "To begin, we import some packages that we will be using for this notebook. We also download our custom helper functions for this notebook in the `notebook_code/map_helpers.py` file."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# need to install packages to create maps!\n",
    "!pip install descartes --user"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "from notebook_code.map_helpers import make_spline, plot_instance, animate_map, exclude_mesh, reanimator\n",
    "\n",
    "import matplotlib.pyplot as plt \n",
    "from IPython.display import HTML\n",
    "import geopandas as gpd\n",
    "import numpy as np\n",
    "import warnings\n",
    "warnings.simplefilter(\"ignore\")\n",
    "\n",
    "%matplotlib inline\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align='center'> Gathering Ontario's Weather Data</h2>\n",
    "\n",
    "In the next cell, we download a csv file of Ontario weather station data that we have aggregated by month. The temperature data has been aggregated such that we only display the average monthly temperature, rather than the houry variations in the previous data set.  "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "\n",
    "file_url = \"https://swift-yeg.cloud.cybera.ca:8080/v1/AUTH_233e84cd313945c992b4b585f7b9125d/callysto-open-data/OntarioTempMonthly.csv\"\n",
    "monthly = pd.read_csv(file_url)\n",
    "# Remove superfluous colomn. \n",
    "del monthly['Unnamed: 0']\n",
    "monthly.head()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Find out how many weather stations we have data for\n",
    "len(monthly[\"Station Name\"].unique())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where we see that the first station we have data for is ALGONQUIN PARK EAST GATE, and we have data from 75 weather stations from across Ontario. \n",
    "\n",
    "Before we can create a useful heat map, we need to obtain a map of Canada so we can have a basis on which to create our map. This is done using a shape file of Canada provided by ArcGis at [the following link](https://www.arcgis.com/home/item.html?id=dcbcdf86939548af81efbd2d732336db). This data is downloaded and extracted below. \n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Download shape file from a zip. \n",
    "zip_url = 'https://swift-yeg.cloud.cybera.ca:8080/v1/AUTH_233e84cd313945c992b4b585f7b9125d/callysto-open-data/Canada.zip'\n",
    "import requests, zipfile, io\n",
    "r = requests.get(zip_url)\n",
    "z = zipfile.ZipFile(io.BytesIO(r.content))\n",
    "z.extractall(\"Canada\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Need to get the lat/long tracing of Ontario \n",
    "shape = gpd.read_file('Canada/Canada.shp')\n",
    "# Convert the projection of the map to a more intuitive \"natural\" projection of lat/long\n",
    "shape = shape.to_crs(epsg=4326)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where we notice that our shape file has `POLYGON` entries in the geometry column. These are the Longitude/Latitude coordinates that define the boundaries of the provinces. These are known as \"shape files\" which define the shape of geographic boundaries. We can use pandas to plot this shape file to check that we have the appropriate map"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "shape.plot(figsize = (10,10))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where that certainly looks like we've captured the map of Canada! Our next step is to filter this down to just Ontario for the purposes of creating our animation. This is done as follows"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# NAME is the name of the column header \n",
    "ont = shape[shape.NAME==\"Ontario\"]['geometry']\n",
    "ont.plot(figsize = (10,10))\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where we now have a boundary that readily defines the boundaries of Ontario. \n",
    "\n",
    "<h3 align='center'> Optional Exercise </h3>\n",
    "\n",
    "Try plotting the shape of other provinces and territories in the cell below\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print(\"These are all the names of provinces/territories you can take a peek at \")\n",
    "list(shape.NAME)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "shape[shape.NAME==\"Quebec\"]['geometry'].plot(figsize = (10,10))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The next step to creating an animated heat map is to create an equally spaced grid of Longitude and Latitude coordinates which will allow us to define a \"box\" on which to plot our heat map. This is done quite easily by simply grabbing all the unique latitude and longitude pairs of the weather station data. This is done below. \n",
    "\n",
    "**Note** you can increase the resolution of the animations we're going to create to make more appealing plots. However, note that doubling the resolution multiplies the computational time required by four. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "lang": "en"
   },
   "outputs": [],
   "source": [
    "# Latitude data\n",
    "x = np.array(monthly.groupby('Station Name').mean()['Latitude'].unique())\n",
    "# Longitue data\n",
    "y = np.array(monthly.groupby('Station Name').mean()['Longitude'].unique())\n",
    "\n",
    "# Change this later for higher resolution plots, but you may have to be patient! \n",
    "# Note: This is the number of pixels in the x and y directions we'll create \n",
    "\n",
    "resolution = 50\n",
    "\n",
    "# Now we create an equally spaced list with a number resolution of points within our lat/long range\n",
    "x_grid = np.linspace(x.min(), x.max(), resolution)\n",
    "# Here we're not using the max as there is some \"bad data\" in the data set with incorrect coordinates! \n",
    "y_grid = np.linspace(y.min(), sorted(y)[-2], resolution)\n",
    "\n",
    "x_grid\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align='center'> Wait - a square grid? I thought we were making a map! </h2>\n",
    "\n",
    "Fair point! In this case the reason being is it's _very_ difficult to do 'realistic' heat maps on non-square grids. But that's okay! We can create a mask, or a list of data points that we should remove from our visualization. In our case, let's create a mask which will exclude all points on our grid that don't fall within our map of Ontario. This is done with a function we've abstracted away into a helper file which we call below. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "lang": "en"
   },
   "outputs": [],
   "source": [
    "# This only needs to be run once at any resolution you try. \n",
    "mask = exclude_mesh(x_grid, y_grid, geometry=ont)\n",
    "mask\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where it's really hard to tell what our mask looks like with a bunch of true false values sticking around. However, we can plot our mask to see if it worked, which is done below."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plt.imshow(mask)\n",
    "ax = plt.gca()\n",
    "# This data comes out reversed, so we need to flip it!\n",
    "ax.set_ylim(ax.get_ylim()[::-1])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Where in the plot above, yellow represents true meaning we'll plot data within those pixels, and, what we'll call a dark mauve represents false which means we won't plot data in those pixels. \n",
    "\n",
    "## Creating an Animation"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the cell below we create our animation with the `reanimator` function from our helper file. This will take in our grid and resolution, as well as our mask in order to create the map. \n",
    "\n",
    "**Note:** Creating these animations may take a minute or two, but they are well worth the wait!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "lang": "en"
   },
   "outputs": [],
   "source": [
    "# Define our figure\n",
    "fig, ax = plt.subplots(figsize=(14,10))\n",
    "\n",
    "#Define which months to plot \n",
    "m_list = list(monthly['UTC'].unique())\n",
    "\n",
    "# change 'radial' to either 'smooth' or 'nearest' to see the differences in interpolation\n",
    "ani = reanimator('smooth', m_list, monthly, x_grid, y_grid, mask, ax, fig)\n",
    "\n",
    "# display the animation \n",
    "HTML(ani.to_jshtml())\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the cell above (or in a new cell) create other animations with different interpolations. Your other options are a radial interpolation called with \n",
    "```python\n",
    "ani = reanimator('radial', m_list, monthly, x_grid, y_grid, mask, ax, fig)\n",
    "```\n",
    "\n",
    "or a nearest neighbor interpolation called with \n",
    "```python\n",
    "ani = reanimator('nearest', m_list, monthly, x_grid, y_grid, mask, ax, fig)\n",
    "```\n",
    "\n",
    "It is currently set to a radial interpolation where the program tries to fill in missing data by assuming radial functions describe how the data should evolve to the next point.\n",
    "\n",
    "## Things to think about:\n",
    "\n",
    "1. When you change the interpolation function, notice how the heat map is incredibly different. Would it be easy to mislead someone by choosing/fine tuning your interpolation function? \n",
    "\n",
    "2. Time permitting, change the resolution of this animation. Does it affect the interpolation and affect your interpretation of the data? \n",
    "\n",
    "2. Is static weather station data the best data source to create animations like this? Is there a reason modern meteorologists rely on satellite and radar data more heavily than weather station data for large-scale predictions? Why or why not? "
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<h2 align='center'> Plotting Daily Averages </h2>\n",
    "\n",
    "Things get a little more interesting if we view a data set with a smaller time aggregation. In the animation below we view November 2017 in 12 hour intervals to see how the temperature has changed according to these 12 hour averages. The process and functions to create the map is the same - we just have to change the data set accordingly. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "file_url = \"https://swift-yeg.cloud.cybera.ca:8080/v1/AUTH_233e84cd313945c992b4b585f7b9125d/callysto-open-data/ontario_november_12H.csv\"\n",
    "\n",
    "day = pd.read_csv(file_url)\n",
    "day.head()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Latitude data\n",
    "x2 = np.array(day.groupby('Station Name').mean()['Latitude'].unique())\n",
    "# Longitue data\n",
    "y2 = np.array(day.groupby('Station Name').mean()['Longitude'].unique())\n",
    "\n",
    "# Change this later for higher resolution plots, but you may have to be patient! \n",
    "# Note: This is the number of pixels in the x and y directions we'll create \n",
    "\n",
    "resolution = 50\n",
    "\n",
    "# Now we create an equally spaced list with 100 points within our lat/long range\n",
    "x_grid2 = np.linspace(x.min(), x.max(), resolution)\n",
    "# Here we're not using the max as there is some \"bad data\" in the data set with incorrect coordinates! \n",
    "y_grid2 = np.linspace(y.min(), sorted(y)[-2], resolution)\n",
    "\n",
    "mask2 = exclude_mesh(x_grid, y_grid, geometry=ont)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Define our figure\n",
    "fig, ax = plt.subplots(figsize=(14,10))\n",
    "\n",
    "#Define which months to plot \n",
    "m_list2 = list(day['UTC'].unique())\n",
    "\n",
    "# change 'radial' to either 'smooth' or 'nearest' to see the differences in interpolation\n",
    "ani2 = reanimator('smooth', m_list2, day, x_grid2, y_grid2, mask2, ax, fig, daily=True)\n",
    "\n",
    "# display the animation \n",
    "HTML(ani2.to_jshtml())\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.6.5"
  },
  "nbTranslate": {
   "displayLangs": [
    "*"
   ],
   "hotkey": "alt-t",
   "langInMainMenu": true,
   "sourceLang": "en",
   "targetLang": "fr",
   "useGoogleTranslate": true
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
